Mon. Not. R. Astron. Soc. 000 (2008)    Printed 19 June 2009    (MN LaTeX style file v2.2)



Estimating the H I gas fractions of galaxies in the local 
Universe 



Wei Zhang, Cheng Li, Guinevere Kauffmann, Hu Zou, Barbara Catinella,
Shiyin Shen, Qi Guo, Ruixiang Chang






National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
Max-Planck-Institut für Astrophysik, Karl-Schwarzschild-Strasse 1, 85748 Garching, Germany



Key Laboratory for Research in Galaxies and Cosmology, Shanghai Astronomical Observatory, Nandan Road 80, Shanghai 200030, China 



Accepted Received ; in original form 



ABSTRACT 

We use a sample of 800 galaxies with H I mass measurements from the HyperLeda
catalogue and optical photometry from the fourth data release of the Sloan Digital
Sky Survey to calibrate a new photometric estimator of the H I-to-stellar mass ratio
for nearby galaxies. Our estimator, which is motivated by the Kennicutt-Schmidt star
formation law, is $\log_{10}(G_{\rm HI}/S) = -1.73238(g-r) + 0.215182\mu_i - 4.08451$, where $\mu_i$
is the i-band surface brightness and $g-r$ is the optical colour estimated from the g-
and r-band Petrosian apparent magnitudes. This estimator has a scatter of $\sigma = 0.31$
dex in $\log(G_{\rm HI}/S)$, compared to $\sigma \sim 0.4$ dex for previous estimators that were based
on colour alone. We investigate whether the residuals in our estimate of $\log(G_{\rm HI}/S)$
depend in a systematic way on a variety of different galaxy properties. We find no effect
as a function of stellar mass or 4000 Å break strength, but there is a systematic effect
as a function of the concentration index of the light. We then apply our estimator to
a sample of $10^5$ emission-line galaxies in the SDSS DR4 and derive an estimate of the
H I mass function, which is in excellent agreement with recent results from H I blind
surveys. Finally, we re-examine the well-known relation between gas-phase metallicity
and stellar mass and ask whether there is a dependence on H I-to-stellar mass ratio, as
predicted by chemical evolution models. We do find that gas-poor galaxies are more
metal rich at fixed stellar mass. We compare our results with the semi-analytic models
of De Lucia & Blaizot, which include supernova feedback, as well as the cosmological
infall of gas.

Key words: galaxies: clusters: general - galaxies: distances and redshifts - cosmology: 
theory - dark matter - large-scale structure of Universe. 



1 INTRODUCTION 

The standard model of galaxy formation posits that galax- 
ies form when gas cools, condenses and forms stars at the 
centres of dark matter halos. In recent years, there has been 
an explosion of ground- and space-based surveys that have 
allowed astronomers to obtain imaging and spectroscopy for 
samples of many thousands of galaxies in the local Universe 
and at high redshifts. The vast majority of these surveys 
have probed the rest-frame optical, ultraviolet or infrared 
regions of the spectral energy distributions of the galaxies, 
and have thus provided important constraints on the properties
of the stars in these systems. Thanks to these surveys,
we have learned a huge amount about how the stellar masses
and star formation rates of galaxies evolve with time. How- 
ever, if we are to understand how galaxies form, we also 
need to understand how gas is accreted by galaxies and the 
efficiency with which that gas is converted into stars. 

Our understanding of the cold gas content of galaxies 
lags considerably behind our understanding of their stellar 
populations. In the nearby Universe, new large surveys such 
as The Arecibo Legacy Fast ALFA (ALFALFA) Survey of H I,
which will detect 25,000 extragalactic H I line sources out
to $z \sim 0.06$ using the 305-m telescope and the seven-beam Arecibo
L-band Feed Array (ALFA) (Giovanelli et al. 2005), will do
much to redress the balance, but our poor knowledge of the 
atomic gas content of galaxies at high redshifts is likely to 
persist for many more years. For this reason, there have been
several recent attempts to calibrate colours or emission line
equivalent widths as proxies for the gas-to-stellar mass ratio.

Tremonti et al. (2004) converted star formation surface
densities (estimated from attenuation-corrected Hα luminosities)
to surface gas mass densities, $\Sigma_{\rm gas}$, by inverting
the composite Schmidt law of Kennicutt (1998). These indirect
gas mass estimates were used (in conjunction with
true gas measurements for a minority of the galaxies) to 
argue that the observed correlation between stellar mass 
and gas-phase metallicity could not be explained within 
the context of a closed-box chemical evolution model. The 
same technique has been applied to high redshift galaxies
by Erb et al. (2006) to interpret the redshift evolution in
the mass-metallicity relation, and by Bouché et al. (2007)
to argue that high redshift galaxies lie on a "universal" star 
formation relation. Needless to say, the conclusions reached 
in these papers are only valid if the same Kennicutt-Schmidt 
law applies at high redshifts. 

Because the Hα line is not accessible in high redshift
galaxies without near-infrared spectra, there have also been
attempts to calibrate gas-to-stellar mass ratios using optical
or optical-infrared colours. The H I gas-to-stellar mass
ratio, $G_{\rm HI}/S$, has been found to correlate with optical (e.g.
$u-r$) and optical-NIR (e.g. $u-K$) colours with a typical
scatter of ~0.4 dex (Kannappan 2004, hereafter K04). Since
the stellar mass of a galaxy can also be estimated from its
optical/NIR flux if its redshift is known (see for example
Bell et al. 2003, hereafter B03), these correlations provide
a way of estimating gas masses for large samples of galaxies 
where only photometry is available. Although such gas frac- 
tion estimates have large errors for individual galaxies, they 
may still be useful for statistical studies. 

In this paper we extend the analysis of K04 by examining
the correlation of $G_{\rm HI}/S$ with additional galaxy properties,
in the hope of finding an estimator of H I mass with less
scatter. We first demonstrate that we reproduce the result of
K04 if we use the same $G_{\rm HI}/S$ estimator and the same sample
selection criteria as in K04. We then extend the analysis
by using a larger calibrating sample of galaxies with both
photometry and H I masses, and we examine the correlations
between $G_{\rm HI}/S$ and a large variety of parameters. In
particular, we show that if we combine the observed colour
with an estimate of the surface brightness of the galaxy,
we can reduce the scatter in our estimates of $G_{\rm HI}/S$ from
~0.4 dex to 0.31 dex. Finally, we apply our best estimator to a
large sample of star-forming galaxies selected from the SDSS
DR4. We show that we recover the H I mass function as
estimated from the most recent blind H I survey data. We also
re-examine the mass-metallicity relation of Tremonti et al.
(2004) and show that there are strong residuals in this
relation as a function of $G_{\rm HI}/S$.



2 DATA 

We use the Max Planck Institute for Astrophysics / Johns
Hopkins University (MPA/JHU) SDSS DR4 database as
our parent sample. The sample comprises ~$4 \times 10^5$ objects
that have been spectroscopically confirmed as galaxies
and have data publicly available in the SDSS DR4
(Adelman-McCarthy et al. 2006). Details can be found in
Kauffmann et al. (2003) and Brinchmann et al. (2004). We
also make use of data from the HyperLeda homogenized H I
catalogue (Paturel et al. 2003) and from the 2MASS all-sky
extended source catalogue (XSC; Jarrett et al. 2000).

In order to make comparisons with K04, we first construct
a sample of 721 galaxies (Sample I) by cross-matching
the parent SDSS sample, the 2MASS XSC and
the HyperLeda H I catalogue using the same selection criteria
as in K04. The galaxies are required to have positions
matched to HyperLeda objects within 6" (as in K04) and to
2MASS XSC objects within 3" (as in Blanton et al. 2005).
Following K04, the galaxies are also restricted to
$z < 0.1$, $r < 17.77$ and $K < 15$, and must have
reliable redshifts and magnitudes based on data flags
and errors (magnitude errors < 0.3 in K, < 0.4 in H I, and
< 0.15 in u, g and r). Here u, g and r are the SDSS Petrosian
apparent magnitudes, and K is the 2MASS K-band
extrapolated total magnitude.

Our second sample (Sample II) consists of 800 galaxies
from the cross-match of the SDSS DR4 and HyperLeda
H I data. The selection criteria are the same as above, except
that the 2MASS-related criteria are not required here.
In addition, we have visually examined the r-band image of
each galaxy and dropped those galaxies that have companion
galaxies within 200 arcseconds, to avoid mis-estimating
the H I mass of the main galaxy. This sample will be used to
derive our best estimator of the H I mass fraction. We note
that the median redshift of the galaxies in both Sample I
and Sample II is very low (~0.014; see Figure 5 below).

Our third sample (Sample III) is based on SDSS DR4.
Because the calibration sample (Sample II) is limited at
$z < 0.1$, we apply the same redshift cut to the DR4 sample.
We also apply the same magnitude cuts to both samples
($r < 17.77$). Finally, a galaxy is only included in Sample
III if there is a significant detection of the Hα emission
line in its optical spectrum, that is, if EW(Hα) > 3$\sigma$. Here
EW(Hα) is the equivalent width of the Hα emission line and
$\sigma$ is its error. Both quantities are taken from the MPA/JHU
database. This gives rise to a sample of 157,662 galaxies.
We will use this sample to estimate the H I mass function
and compare it to the results from real H I surveys (§4.1).
We note that more than 99% of galaxies in Sample II show
EW(Hα) > 3$\sigma$.

Finally we construct a fourth sample (Sample IV),
which is also based on SDSS DR4, and consists of 64,305
star-forming galaxies with $r < 17.77$ and $z < 0.1$, high
S/N emission lines (S/N > 3 for all four emission lines
on the BPT (Baldwin-Phillips-Terlevich) diagram; see
Brinchmann et al. 2004 for details) and reliable estimates of
oxygen abundances (magnitude errors < 0.15 in the five
SDSS photometric bands; metallicities in the range
$7.5 < 12 + \log({\rm O/H}) < 9.5$). We use this sample to study whether
the relation between stellar mass and gas-phase metallicity
depends on the amount of gas in the galaxy (§4.2).

Throughout this paper we use stellar masses estimated 



^ We have also selected a smaller sample consisting of 129 galax- 
ies that have no companions within 10 arcminutes and found no 
significant change in the resulting H I gas estimator. 



© 2008 RAS, MNRAS 000






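[Figure 1: two panels showing log(G/S) against u - r (left) and u - K (right). Panel labels, left: All $\sigma$ = 0.45, S-F $\sigma$ = 0.47, AGN $\sigma$ = 0.41; right: All $\sigma$ = 0.37, S-F $\sigma$ = 0.38, AGN $\sigma$ = 0.38.]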



Figure 1. Correlation of the atomic gas-to-stellar mass ratio, G/S, with $u-r$ (left panel) and $u-K$ (right panel) colours, for 721
galaxies in Sample I that are selected from the main sample of SDSS DR4 and also have data from HyperLeda and 2MASS. Galaxies
are plotted as red dots if they are classified as AGN, blue dots if they are classified as high S/N star-forming galaxies, and open circles
if they are unclassified by Brinchmann et al. (2004). Gas masses are derived from H I fluxes with a helium correction factor of 1.4 as in
Kannappan (2004), and stellar masses are derived from K-band fluxes using stellar mass-to-light (M/L) ratios estimated from $g-r$
colours as in Bell et al. (2003). The three lines are the best-fitting linear relation (solid) and the 1$\sigma$ variance (dashed), which are shown
in red for the AGN, in blue for the high S/N star-forming galaxies, and in black for the galaxies as a whole. The best-fit relations for
the whole sample are $\log(G/S) = 1.48 - 0.99(u-r)$ and $\log(G/S) = 2.19 - 0.58(u-K)$ with $\sigma$ = 0.45 and 0.37 dex, in good agreement
with Kannappan (2004), who found $\log(G/S) = 1.46 - 1.06(u-r)$ and $\log(G/S) = 1.87 - 0.56(u-K)$ with $\sigma$ = 0.42 and 0.37 dex.



from the i-band luminosity and $g-r$ colour using the formula
provided in B03, that is, $\log(M_*/L_i) = -0.222 + 0.864(g-r)$.
The surface brightness used here is defined as
$\mu_i = m_i + 2.5\log(2\pi R_{50}^2)$, where $m_i$ is the apparent Petrosian
i-band magnitude and $R_{50}$ is the radius (in units
of arcsec) enclosing 50% of the total Petrosian i-band
flux. The stellar surface mass density is given by
$\log\mu_* = \log M_* - \log(2\pi R_{50}^2)$, where $M_*$ is the stellar mass and
$R_{50}$ is defined in the same way as above but in units
of kpc. The SDSS apparent magnitudes are corrected for
foreground extinction and are k-corrected to their $z = 0$
value using the kcorrect v4.1.4 code of Blanton et al.
(2003). The 2MASS K-band magnitude is k-corrected using
$k(z) = -2.1z$ (see B03). Other parameters such as star
formation rate (SFR), oxygen abundance, and emission line
fluxes are taken from the MPA/JHU database. The reader
is referred to Brinchmann et al. (2004), Kauffmann et al.
(2003) and Tremonti et al. (2004) for detailed descriptions
of how these quantities are derived. We use the total SFR
for which the aperture bias is corrected using resolved imaging.
Throughout this paper we assume a cosmological model
with $\Omega_0 = 0.3$, $\Lambda_0 = 0.7$, and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.
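
The photometric definitions above can be sketched in a few lines of Python. This is only an illustration of the three formulae, not the pipeline used for the paper: the function names are hypothetical, the i-band luminosity is assumed to be supplied already in solar units, and extinction and k-corrections are assumed to have been applied beforehand.

```python
import math

def log_stellar_mass(log_l_i, g_minus_r):
    """log10(M*/Msun) from log10(L_i/Lsun), using log(M*/L_i) = -0.222 + 0.864 (g - r)."""
    return log_l_i - 0.222 + 0.864 * g_minus_r

def surface_brightness(m_i, r50_arcsec):
    """mu_i = m_i + 2.5 log10(2 pi R50^2), with R50 in arcsec (mag arcsec^-2)."""
    return m_i + 2.5 * math.log10(2.0 * math.pi * r50_arcsec ** 2)

def log_surface_mass_density(log_mstar, r50_kpc):
    """log10(mu_*) = log10(M*) - log10(2 pi R50^2), with R50 in kpc (Msun kpc^-2)."""
    return log_mstar - math.log10(2.0 * math.pi * r50_kpc ** 2)

# Illustrative (made-up) galaxy: L_i = 1e10 Lsun, g - r = 0.5, m_i = 15, R50 = 1 arcsec = 1 kpc.
log_mstar = log_stellar_mass(10.0, 0.5)
mu_i = surface_brightness(15.0, 1.0)
log_mu_star = log_surface_mass_density(log_mstar, 1.0)
```

Keeping the two radii as separate arguments (arcsec for $\mu_i$, kpc for $\mu_*$) mirrors the text, where the same $R_{50}$ is expressed in different units for the two quantities.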



3 ESTIMATING GAS MASSES 

3.1 Correlations of $G_{\rm HI}/S$ with galaxy properties

In Fig. 1 we plot the correlation of the atomic gas-to-stellar
mass ratio, G/S, with $u-r$ (left panel) and $u-K$ (right
panel) colours for the 721 galaxies in Sample I. The atomic
gas masses and the stellar masses plotted here are estimated
in exactly the same way as in K04. Briefly, gas masses are
derived from H I fluxes with a helium correction factor of 1.4,
while stellar masses are from K-band fluxes using stellar
mass-to-light (M/L) ratios estimated from $g-r$ colours as in
B03. The only difference from the analysis of K04 is the fact
that our sample is based on a later SDSS data release. Using
a sample of 346 galaxies constructed from cross-matching
the SDSS DR2, the 2MASS XSC and the HyperLeda H I
catalogue, K04 found $\log(G/S) = 1.46 - 1.06(u-r)$ and
$\log(G/S) = 1.87 - 0.56(u-K)$ with 1$\sigma$ variance $\sigma$ = 0.42
and 0.37 dex. As can be seen from Figure 1, the result of K04
is well reproduced, both in the overall amplitude and slope of
the mean correlations, and in the scatter about the mean.
We also divide our sample into star-forming galaxies and
AGN using the standard BPT classification diagram (see
Kauffmann et al. 2003; Brinchmann et al. 2004 for details)
and plot these two classes using different colour codings. As
can be seen, AGN and star-forming galaxies do not show a
significant difference in their correlation between gas fraction
and colour. We therefore do not remove AGN from any
of our samples, with the exception of Sample IV, because we
cannot obtain accurate gas-phase metallicity measurements
for AGN.

We then examine the correlations between $G_{\rm HI}/S$ and a
variety of physical quantities, in the hope of finding an even
better estimator for the gas fraction. It has long been
established that the fraction of atomic gas correlates with a
variety of galaxy properties. It would certainly not be
surprising to find that a higher gas content is associated with
galaxies with lower stellar masses, bluer colours, lower surface
brightnesses, disk-dominated morphologies and spectral
types indicative of the presence of a young stellar population.
In our calibrating sample, $G_{\rm HI}/S$ exhibits the tightest
correlations with colour, surface brightness and stellar surface
mass density, with a typical scatter of 0.4 dex (see
Fig. 2). Interestingly, we find that the correlation between
$G_{\rm HI}/S$ and i-band surface brightness or surface stellar mass
density is much tighter than the correlation between $G_{\rm HI}/S$
and concentration index, which provides a good measure of
the bulge-to-disk ratio of the galaxy (Gadotti 2009).

Figure 2. Correlation of the atomic gas-to-stellar mass ratio, G/S, with $g-r$ colour (top left), aperture corrected specific star formation
rate from Brinchmann et al. (2004) (top right), i-band surface brightness (middle left), stellar surface mass density (middle right), T-type
(bottom left, see text for more explanation), and concentration index (bottom right) for 800 galaxies in Sample II.

We have also examined another indicator of morphology,
T-type (de Vaucouleurs et al. 1991). We estimate this
quantity using a back propagation neural network (BPNN).
The basic methodology we employ (number of layers and
neurons, the transfer function, the training algorithm, etc.)
is exactly the same as described in Ball et al. (2004). The
training and test samples are drawn from the RC3 and consist
of 4393 galaxies with imaging data from the SDSS DR4.
We follow Fukugita et al. (2007) and use 29 photometric
parameters, which are available from SDSS, as the input for
the BPNN. The inner structure of the network is adjusted
iteratively until an optimally trained network that links T
and the available set of photometric parameters is obtained.
The network is then applied to the galaxies in our sample
with no T measurement. As can be seen from the figure, the
correlation of $G_{\rm HI}/S$ and T-type is also weaker than the
correlation with surface mass density or surface brightness.

We find that $G_{\rm HI}/S$ correlates very strongly with
galaxy colour and reasonably strongly with specific star
formation rate (corrected for aperture effects as described
in Brinchmann et al. 2004). We also find correlations with
other quantities, but since all of them exhibit considerably
more scatter, we do not show them here.

3.2 Deriving new $G_{\rm HI}/S$ estimators

In this paper, we will only concern ourselves with our two
strongest correlations, i.e. the correlation between $G_{\rm HI}/S$
and colour or specific star formation rate, and the correlation
between $G_{\rm HI}/S$ and surface brightness or stellar surface
mass density. How can these correlations be understood?
Let us consider the Kennicutt-Schmidt law of star formation
(Schmidt 1963; Kennicutt 1998), in which the surface
density of star formation rate scales with the surface density
of total (atomic + molecular) cold gas as an increasing
power law,

$$\Sigma_{\rm SFR} \propto \Sigma_{\rm gas}^{n}, \qquad (1)$$

with a slope of $n \sim 1.4$. If the two surface densities are
estimated over the same radius, one can easily rewrite the
Kennicutt-Schmidt law as follows:

$${\rm SFR}/M_* \propto (G/S)^{n}\,\mu_*^{n-1}, \qquad (2)$$

where $G/S = M_{\rm gas}/M_*$, and $M_{\rm gas}$ and $M_*$ are the total
mass of cold gas and stars. This follows from writing
$\Sigma_{\rm SFR} = {\rm SFR}/A$, $\Sigma_{\rm gas} = M_{\rm gas}/A$ and $\mu_* = M_*/A$ for a common
area $A$. It is thus natural to expect the
gas mass-to-stellar mass ratio to correlate with these properties,
just as shown in Figure 2. More interestingly, the above
equation implies that there exists a plane of correlations,
similar to the fundamental plane of early-type galaxies,
involving the following three variables: the gas-to-stellar mass
ratio ($G/S$), the specific star formation rate (${\rm SFR}/M_*$) and
the surface stellar mass density ($\mu_*$). This plane can be
defined by

$$\log(G/S) = \log(M_{\rm gas}/M_*) = a\log\mu_* + b\log({\rm SFR}/M_*) + c, \qquad (3)$$

where the coefficients a, b, and c can be determined by
minimizing the residuals from the plane. If the gas content
parameter $G/S$ that enters this plane scales linearly with the
observed H I-to-stellar mass ratio (we admit that this may
well be an over-simplification of the true situation), then
$\log(G_{\rm HI}/S)$ can also be expressed as a linear combination of
the logarithm of the stellar surface mass density and the
logarithm of the specific star formation rate.
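
Determining the coefficients of such a plane by minimizing the residuals is an ordinary least-squares problem. The sketch below is one way to set it up in plain Python; it is not the fitting code used for the paper. The helper names, the coefficient values used to generate the mock data, and the mock data themselves are all assumptions made for illustration.

```python
import random

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (aug[r][3] - sum(aug[r][k] * x[k] for k in range(r + 1, 3))) / aug[r][r]
    return x

def fit_plane(x1, x2, y):
    """Least-squares fit of y = a*x1 + b*x2 + c; returns (a, b, c, rms residual)."""
    cols = [x1, x2, [1.0] * len(y)]
    # Normal equations: (X^T X) p = X^T y
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    a, b, c = solve3(xtx, xty)
    resid = [yi - (a * u + b * w + c) for u, w, yi in zip(x1, x2, y)]
    rms = (sum(r * r for r in resid) / len(resid)) ** 0.5
    return a, b, c, rms

# Mock sample (hypothetical coefficients, 0.3 dex Gaussian scatter).
rng = random.Random(0)
log_mu_star = [rng.uniform(7.0, 9.5) for _ in range(500)]
log_ssfr = [rng.uniform(-11.5, -9.0) for _ in range(500)]
log_gs = [-0.5 * m + 0.6 * s + 9.0 + rng.gauss(0.0, 0.3)
          for m, s in zip(log_mu_star, log_ssfr)]
a, b, c, scatter = fit_plane(log_mu_star, log_ssfr, log_gs)
```

The returned rms residual plays the role of the 1$\sigma$ scatter quoted for each plane in the text.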

Our best-fit relation between $G_{\rm HI}/S$ and the linear
combination of $\mu_*$ and ${\rm SFR}/M_*$ (from Sample II) is shown
in the far-right panel of Figure 3. The galaxies are plotted
as black dots, and the best-fit relation is shown as a solid
line. The 1$\sigma$ scatter around the relation is 0.33 dex, which is
significantly smaller than the scatter in the relation between
$\log(G_{\rm HI}/S)$ and stellar surface mass density or the relation
between $\log(G_{\rm HI}/S)$ and specific star formation rate.

This relation is also tighter than the correlations of
$G_{\rm HI}/S$ with all the other galaxy properties we have investigated.
Thus, this relation can serve as a better estimator of
H I-to-stellar mass ratio, compared to those derived using a
single parameter.

We note that the correlation between colour and $G_{\rm HI}/S$
shown in Figure 2 is actually somewhat tighter than the
correlation between ${\rm SFR}/M_*$ and $G_{\rm HI}/S$. One reason for
this may be that the Brinchmann et al. (2004) aperture
corrections are not accurate. As shown in Brinchmann et al.
(2004) (see their Fig. 14), for $\log({\rm SFR}/M_*) > -10.5$, the 2$\sigma$
uncertainties on the aperture-corrected log SFRs are larger
by ~0.3 dex than for log SFR measured inside the fibre
because the aperture corrections are significantly more
uncertain than the SFR estimates from the spectra. The errors
will be even larger for the galaxies in our sample, because
the median redshift is much lower and the total star formation
rates are derived almost entirely from the photometry,
not from the emission lines measured in the spectra.

In addition, we find the correlation between $G_{\rm HI}/S$ and
surface brightness is just as tight as the correlation between
$G_{\rm HI}/S$ and stellar surface mass density, so there is no real
value in working with the Brinchmann et al. (2004) star
formation rate estimates, rather than with directly measured
colours. For our purposes, this is in fact quite encouraging,
because it suggests that one can get a reasonably good
prediction for the fraction of atomic gas in a galaxy from
photometrically measured quantities, which are the quantities
most frequently available for high redshift galaxies.

We propose that the $(g-r)$ colour and i-band surface
brightness $\mu_i$ are good choices for this purpose. The
correlation plane that uses the quantities $(g-r)$ and $\mu_*$ is shown
in the leftmost panel in Figure 3, and the plane that uses
the quantities $(g-r)$ and $\mu_i$ is shown in the central panel.
The scatters in the two relations are the same (0.31 dex), even
smaller than that of the plane involving ${\rm SFR}/M_*$ and $\mu_*$
(0.33 dex). We thus adopt the relation in the central panel,

$$\log_{10}(G_{\rm HI}/S) = -1.73238(g-r) + 0.215182\mu_i - 4.08451, \qquad (4)$$

as our final estimator of $G_{\rm HI}/S$. The scatter in our new
estimator represents a 20% decrease compared with that of
K04.
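
Equation (4) is simple enough to evaluate directly. The following Python sketch applies it to a single galaxy; the function names and the example input values are illustrative assumptions, and the inputs are assumed to be the extinction- and k-corrected Petrosian quantities defined in Section 2.

```python
def log_ghi_over_s(g_minus_r, mu_i):
    """Equation (4): log10 of the H I-to-stellar mass ratio from colour and i-band surface brightness."""
    return -1.73238 * g_minus_r + 0.215182 * mu_i - 4.08451

def hi_mass(stellar_mass, g_minus_r, mu_i):
    """Estimated H I mass, in the same units as stellar_mass."""
    return stellar_mass * 10.0 ** log_ghi_over_s(g_minus_r, mu_i)

# Made-up example: g - r = 0.5, mu_i = 21 mag arcsec^-2, M* = 1e10 Msun.
log_ratio = log_ghi_over_s(0.5, 21.0)
m_hi = hi_mass(1.0e10, 0.5, 21.0)
```

Because the calibration scatter is 0.31 dex, an individual estimate of this kind is uncertain by roughly a factor of two, which is why the estimator is intended for statistical use on large samples.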

Before we go ahead and apply this estimator to large
samples or try to extrapolate it to higher redshifts, it is
important to test whether we can find any systematic effects
that could bias such analyses. For example, we know that
pure passive aging of stellar populations will cause the colour
of a galaxy to evolve with time. It would therefore not be
surprising if $G_{\rm HI}/S$ at a fixed value of the $g-r$ colour were
to depend on redshift. In the absence of H I data for high
redshift galaxies, the only way we can test for such effects
is to see whether the residuals around our best-fit estimator
are correlated with intrinsic properties of the galaxies, such
as stellar mass, morphology or mean stellar age. In Figure 4
we plot the residuals of $G_{\rm HI}/S$ from the relation given in
Eq. 4 as a function of stellar mass, concentration index and
4000 Å break strength. There is no clear tendency for the
$G_{\rm HI}/S$ residuals to correlate with stellar mass or with 4000
Å break strength. There is a small, but significant trend in
the residual as a function of the concentration index, in the
sense that our estimator overpredicts the gas fraction for the
least concentrated galaxies and underpredicts for the most
concentrated objects. The effect is not large: a ~0.2 dex
shift in the predicted value of $\log(G_{\rm HI}/S)$ from the least
concentrated to the most concentrated galaxies in our sample.
We do not see significant trends in the scatter in the
residuals as a function of any galaxy property, suggesting that it
may be possible to calibrate out such systematic trends in
the future.

Figure 3. We plot the relations between $G_{\rm HI}/S$ and the linear combinations of a) $g-r$ colour and stellar surface mass density (left),
b) $g-r$ colour and i-band surface brightness (middle), and c) aperture corrected specific star formation rate from Brinchmann et al.
(2004) and stellar surface mass density (right), that minimize the scatter.

Figure 4. We plot the residual in the predicted value of $G_{\rm HI}/S$ from equation (4) as a function of stellar mass (top left), concentration
index (middle left) and 4000 Å break strength (bottom left). We have also divided the galaxies into 8 subsamples as indicated by the
coloured dots (left panels) or lines (right panels). The left panels show the residual for individual galaxies, while the right panels show
the distribution of the residuals in each of the subsamples.

Figure 5. Distributions of redshift, i-band absolute magnitude, $g-r$ colour, i-band surface brightness, concentration parameter $R_{90}/R_{50}$
measured in i-band, and H I gas mass, for galaxies in our calibration sample (Sample II, solid line) and in Sample III (dotted line). In
each panel the maximum of the two curves is set to unity and the two curves are normalized so that they enclose the same area.



We caution that our calibrating sample is compiled from 
many different samples with different selection effects, and 
may thus be biased because it is not a truly representative
sample of nearby galaxies. This can be seen from Figure 5,
where we plot the histograms of a variety of physical
properties for galaxies in Sample II. It will be important to 
re-examine these issues with larger and more homogeneous 
galaxy samples, as will be provided by the ALFALFA survey. 



4 APPLICATIONS 



We now illustrate two different applications of our $G_{\rm HI}/S$
indicator. We first use the colours and surface brightnesses of 
the galaxies in Sample III to estimate the H I mass function 
and we compare our estimate with recent results from real H 
I surveys. We then examine whether residuals in the relation 
between gas phase metallicity and stellar mass are correlated 
with the H I content of galaxies.



Figure 6. Left: the H I mass function for galaxies in the SDSS DR4 that have redshifts below 0.1 and evident Hα emission in their optical
spectra is plotted as symbols, compared with the result of a previous study based on real H I observations, which is plotted as a line.
Center: the H I mass function is plotted for red and blue galaxies separately. The result for the full sample is repeated from the left panel. Right:
redshift distributions for red and blue galaxies as well as for the galaxies as a whole are plotted in the upper part. The lower part shows
the redshift distribution for galaxies in the full sample, but at three fixed stellar masses as indicated.



4.1 H I Mass Function

For each galaxy i in Sample III we compute the quantity
$z_{{\rm max},i}$, which is defined as the maximum redshift at
which the galaxy would satisfy the apparent magnitude
limit of the sample. k-corrections are included when calculating
$z_{{\rm max},i}$ as described in Blanton et al. (2003) (see also
Blanton & Roweis 2007). We do not apply evolutionary corrections,
because our sample is constrained to lie at $z < 0.1$.
$V_{{\rm max},i}$ is defined for each galaxy as the comoving volume of
the survey out to redshift $z_{{\rm max},i}$, or out to $z = 0.1$ if $z_{{\rm max},i} > 0.1$. The
H I mass function is thus estimated as

$$\Phi(M_{\rm HI})\,\Delta M_{\rm HI} = \sum_i \left(f_{{\rm sp},i}\,V_{{\rm max},i}\right)^{-1}, \qquad (5)$$

where $f_{{\rm sp},i}$ is the spectroscopic completeness of the survey
area where the galaxy i is located and is defined as the fraction
of the photometrically targeted galaxies in the area for
which usable spectra were obtained. The sum in the above
equation extends over all galaxies with H I mass in the range
$M_{\rm HI} \pm 0.5\Delta M_{\rm HI}$.

Figure 6 shows the H I mass function for galaxies in
Sample III. The error bars are estimated using the bootstrap
resampling technique (Barrow et al. 1984). We generated
100 bootstrap samples from Sample III and computed
the H I mass function for each sample. The errors are
then given by the scatter of the mass function among these
samples. We would like to point out that these errors are
underestimated, because they account only for the uncertainties
due to sampling variance, but do not include the
effect of cosmic variance, systematic uncertainties in the
estimates of stellar masses, and scatter in the estimates of H I
mass. For comparison, we also plot in Figure 6 a recent
measurement of the H I mass function by Zwaan et al. (2005),
based on a complete H I-selected sample of 4315 galaxies
from the HI Parkes All Sky Survey (HIPASS). HIPASS
achieves 100% coverage over the southern sky and the samples
used for calculating the H I mass functions lie in the
redshift range $0.003 < z < 0.02$. Most other determinations
of the H I mass function have been based on significantly
smaller samples (Zwaan et al. 1997; Henning et al.
2000; Rosenberg & Schneider 2002; Zwaan et al. 2003), or
have been based on samples containing only specific
morphological types (Springob et al. 2005).
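
The $1/V_{\rm max}$ sum and the bootstrap error estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: the function names and the three-galaxy toy catalogue are hypothetical, $V_{{\rm max},i}$ and $f_{{\rm sp},i}$ are assumed to have been computed already, and the result is divided by the bin width so that it is expressed per dex in H I mass.

```python
import math
import random

def hi_mass_function(log_mhi, vmax, f_sp, edges):
    """1/Vmax estimate: sum of 1/(f_sp * Vmax) in each log-mass bin, per dex."""
    phi = [0.0] * (len(edges) - 1)
    for lm, v, f in zip(log_mhi, vmax, f_sp):
        for k in range(len(phi)):
            if edges[k] <= lm < edges[k + 1]:
                phi[k] += 1.0 / (f * v)
                break
    return [p / (edges[k + 1] - edges[k]) for k, p in enumerate(phi)]

def bootstrap_errors(log_mhi, vmax, f_sp, edges, n_boot=100, seed=0):
    """Scatter of the mass function among n_boot resamplings with replacement."""
    rng = random.Random(seed)
    n = len(log_mhi)
    runs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        runs.append(hi_mass_function([log_mhi[i] for i in idx],
                                     [vmax[i] for i in idx],
                                     [f_sp[i] for i in idx], edges))
    errors = []
    for k in range(len(edges) - 1):
        vals = [run[k] for run in runs]
        mean = sum(vals) / n_boot
        errors.append(math.sqrt(sum((v - mean) ** 2 for v in vals) / (n_boot - 1)))
    return errors

# Toy catalogue (made-up values): log10(M_HI/Msun), Vmax in Mpc^3, completeness.
log_mhi = [9.1, 9.2, 10.5]
vmax = [100.0, 100.0, 200.0]
f_sp = [1.0, 0.5, 1.0]
edges = [9.0, 10.0, 11.0]
phi = hi_mass_function(log_mhi, vmax, f_sp, edges)
err = bootstrap_errors(log_mhi, vmax, f_sp, edges)
```

As the text notes, errors obtained this way reflect sampling variance only; cosmic variance and the 0.31 dex scatter of the estimator itself are not included.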

Our derived H I mass function is in excellent agreement
with the Zwaan et al. (2005) result and is reasonably well fit
by a single Schechter function with parameters close to those
given by Zwaan et al. (2005). A comparison of our measured
H I mass function with their best-fit Schechter function gives
a reduced statistic of $\chi^2/{\rm d.o.f.} = 24/12$. Their parameters
yield slightly fewer galaxies at the low-mass end. A straightforward
integration of our H I mass function gives the mean
comoving H I mass density of the low-redshift Universe of
$\rho_{\rm HI} \approx 7.5 \times 10^{7}\,h\,M_\odot\,{\rm Mpc}^{-3}$, corresponding to a cosmological
H I mass density of $\Omega_{\rm HI} \approx 2.7 \times 10^{-4}\,h^{-1}$, again
consistent with Zwaan et al. (2005). In the standard concordance
cosmology (e.g. Komatsu et al. 2009) this
density implies that only ~0.8% of the baryons in the
low-redshift Universe are in H I gas in galaxies, compared to
3.5% in stars (Li & White 2009).
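
The conversion from $\rho_{\rm HI}$ to $\Omega_{\rm HI}$ and to a baryon fraction is a short piece of arithmetic, sketched here in Python. The critical density $2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$ is the standard value; the baryon density $\Omega_b h^2 = 0.0227$ is an assumed, WMAP-like value chosen for illustration and is not quoted in the text above.

```python
RHO_CRIT = 2.775e11      # critical density in h^2 Msun Mpc^-3
RHO_HI = 7.5e7           # measured H I density in h Msun Mpc^-3 (from the text)
H = 0.7                  # Hubble parameter adopted in this paper
OMEGA_B_H2 = 0.0227      # assumed baryon density parameter times h^2

omega_hi_hinv = RHO_HI / RHO_CRIT     # Omega_HI in units of h^-1
omega_hi = omega_hi_hinv / H          # Omega_HI for h = 0.7
omega_b = OMEGA_B_H2 / H ** 2         # Omega_b for h = 0.7
baryon_fraction = omega_hi / omega_b  # fraction of baryons in H I
```

With these inputs the H I fraction comes out just under one per cent, in line with the ~0.8% quoted above.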

It is also interesting to compare the relative abundance
of red and blue galaxies at given H I mass. We split the
sample into two colour bins using the following
luminosity-dependent cut:

$$(g-r) = -0.104 - 0.042\,M_r. \qquad (6)$$

We then compute the H I mass functions for the red and blue
subsamples separately and plot them in the right-hand panel
of Figure 6. The result for the full sample is also plotted. As
can be seen, the H I mass function is dominated by blue
galaxies at all masses.

The redshift distribution of the red/blue subsamples as 
well as that of the full sample is shown in the far-right panel 
in Figure [§] The distribution of other physical properties 
of the full sample is plotted in dashed lines in Figure [S] 
The redshift distributions show strong features which reflect 
large-scale structure. The structure at z ~ 0.08 is the
well-known supercluster in SDSS, the Sloan Great Wall. Two
other weaker structures are also seen at redshifts around 0.01
and 0.03. These structures are likely the reason why our H

Figure 7. Correlation of metallicity with stellar mass (top) and with stellar mass plus H I gas mass (bottom). The left-hand panels show 
results for a sample of 800 galaxies from the HyperLeda catalogue for which the H I mass is actually measured. The right hand panels 
show results for 1.4 X 10* star-forming galaxies in SDSS DR4 for which the H I mass is given by our estimator based on the i-band surface 
brightness and g - r colour. Each pixel in the mass-metallicity plane has been colour-coded according to the mean H I gas fraction of
the galaxies that fall in that region. The green line in each panel shows the 2nd-order polynomial function that provides the best fit to 
the mean relation. 



I mass function shows slightly higher (but still within error 
bars) amplitude than that of Zwaan et al. (2005), at masses 
below ^ 10^ Mq and at those around the characteristic mass 
(IO'^-^^Mq). However, given the large uncertainties in our H 
I mass estimates and the small sample size of Zwaan et al. 
(2005), such discrepancies should not be overemphasised. 

4.2 The Mass-Metallicity Relation 

The relationship between mass and metallicity (MZR;
Lequeux et al. 1979; Tremonti et al. 2004) is of particular
interest in studies of galaxy formation and evolution. There is
now observational (e.g. Garnett 2002; Tremonti et al. 2004)
evidence that metal loss via galactic winds (Larson 1974)
may be largely responsible for driving the relation. However,
other processes may also be important. For example, low-mass
galaxies have higher gas fractions (e.g. Boselli et al.
2002) and are hence expected to be less enriched even if they
were to evolve as closed boxes. It has even been proposed
that a variable IMF could also produce a mass-metallicity
relation (Köppen et al. 2007).
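
The closed-box argument can be made concrete with the textbook solution for a closed box with instantaneous recycling, $Z = y \ln(1/\mu)$, where $\mu$ is the gas mass fraction and $y$ is the nucleosynthetic yield. This relation is not derived in this paper, and the yield used below is a purely illustrative value; the sketch only shows that a higher gas fraction implies a lower metallicity even without any outflow.

```python
import math


def closed_box_metallicity(gas_fraction, yield_=0.01):
    """Gas-phase metallicity Z = y * ln(1 / mu) of a closed box.

    gas_fraction (mu) is gas mass over total baryonic mass, with 0 < mu <= 1.
    yield_ is a hypothetical value chosen for illustration only.
    """
    if not 0.0 < gas_fraction <= 1.0:
        raise ValueError("gas_fraction must lie in (0, 1]")
    return yield_ * math.log(1.0 / gas_fraction)


# A gas-rich (low-mass) and a gas-poor (high-mass) galaxy.
for mu in (0.8, 0.2):
    print(mu, closed_box_metallicity(mu))
```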

In addition, the mass-metallicity relation has been
found to depend on other properties of the galaxies
in the sample (for example surface mass density
[Tremonti et al. 2004], star formation rate as measured by
ultraviolet luminosity and surface brightness [Hoopes et al.
2007], specific star formation rate and size [Ellison et al.
2008], the presence or absence of close companions [e.g.
Michel-Dansac et al. 2008], cluster membership and local
density [Ellison et al. 2009] and also on large-scale
environment [e.g. Cooper et al. 2008]). It would be useful to
establish which of these dependencies are primary causes of
variation in the relation, and which are secondary effects. For
example, if the mass-metallicity relation depends on specific
star formation rate, it will naturally also depend on
environment, simply because the specific star formation rates of
galaxies depend strongly on density.

The gas mass fraction is the natural parameter for quan- 
tifying the degree to which a galaxy has exhausted its avail- 
able fuel supply and one would expect the mass-metallicity 
relation to depend quite strongly on this quantity. In Figure 7
we examine the trend of gas mass fraction, defined
as the ratio of H I gas mass to the sum of H I and stellar 
mass, with the location of galaxies in the plane of metal- 
licity versus stellar mass (top) and metallicity versus H I 
mass plus stellar mass (bottom). In each case, we compare 
Figure 8. Left: H I gas mass fraction versus stellar mass relation for metal-rich (red) and metal-poor (blue) galaxies, for a small sample
of galaxies that have real H I observations. The three small panels compare the histogram of H I gas fraction for metal-rich (red solid) and
metal-poor (blue dashed) galaxies in three stellar mass intervals as indicated. Right: contours of number density of galaxies in the plane
of H I gas mass fraction versus stellar mass, for a large sample of star-forming galaxies from the SDSS DR4 with H I mass estimated from
photometry. Red and blue lines are for metal-rich and metal-poor galaxies respectively. The contour levels, which indicate the fraction
of galaxies in the two subsamples that fall within a given region of the plane, are decreased by factors of 4 from the highest
($0.016\,[0.2\,\log_{10} M_\odot]^{-1}\,[0.05\,{\rm dex}]^{-1}$) to the lowest ($6.25 \times 10^{-5}\,[0.2\,\log_{10} M_\odot]^{-1}\,[0.05\,{\rm dex}]^{-1}$). The region enclosed by the highest-level contour is
shaded in red (blue) for the metal-rich (-poor) population.



results for our small calibrating sample with real H I masses
(Sample II, left-hand panels) with results obtained for all
high-S/N star-forming galaxies (Sample IV, right-hand panels).
As can be seen, the basic trend is very similar for the
two samples. Star-forming galaxies with lower masses and
metallicities have higher gas fractions.
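
For reference, the gas mass fraction used here, $M_{\rm HI}/(M_{\rm HI} + M_*)$, follows directly from the H I-to-stellar mass ratio: if $R = M_{\rm HI}/M_*$, the fraction is $R/(1+R)$. The short sketch below spells this out; the function names are hypothetical.

```python
def gas_fraction(m_hi, m_star):
    """H I gas mass fraction: M_HI / (M_HI + M_*)."""
    return m_hi / (m_hi + m_star)


def gas_fraction_from_log_ratio(log_ratio):
    """Gas fraction from log10(M_HI / M_*), using f = R / (1 + R)."""
    ratio = 10.0 ** log_ratio
    return ratio / (1.0 + ratio)


# Equal H I and stellar masses give a gas fraction of one half.
print(gas_fraction(1.0e9, 1.0e9))
print(gas_fraction_from_log_ratio(-1.0))
```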

In addition, Figure 7 shows that at fixed stellar mass,
metal-poor galaxies have higher H I fractions than metal-rich
galaxies. This is clearly seen in Figure 8, where we plot
the gas fraction-stellar mass relation for galaxies in Sample
II and Sample IV. Metal-poor galaxies are plotted in blue,
while metal-rich galaxies are plotted in red. In order to split
our sample into two subsamples in metallicity, we determine
a stellar mass-dependent cut in metallicity by fitting a
2nd-order polynomial function to the stellar mass-metallicity
relation shown in the top two panels of Figure 7. A galaxy
located above (below) this cut is thus classified as metal-rich
(-poor). As can be seen from Figure 8, metal-poor galaxies
are shifted almost vertically in this diagram, towards higher
H I fractions. This is true for both our calibrating sample
and for the full sample of star-forming galaxies. One possible
interpretation of this result is that the scatter of metallicity
at fixed stellar mass might be partially (if not totally) due
to recent inflow of less enriched gas from the surrounding
halo.
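
The metallicity split described above can be sketched as follows. This is a minimal illustration on synthetic data, not the procedure actually used for Samples II and IV: it assumes NumPy is available, fits the polynomial to the individual points rather than to a binned mean relation, and uses made-up values for the mock relation and its scatter.

```python
import numpy as np


def split_by_metallicity(log_mstar, metallicity):
    """Fit a 2nd-order polynomial to metallicity versus log stellar mass and
    flag galaxies lying above the fitted curve as metal-rich."""
    log_mstar = np.asarray(log_mstar, dtype=float)
    metallicity = np.asarray(metallicity, dtype=float)
    coeffs = np.polyfit(log_mstar, metallicity, deg=2)
    cut = np.polyval(coeffs, log_mstar)
    return metallicity > cut, coeffs


# Synthetic mock sample (all numbers below are hypothetical).
rng = np.random.default_rng(0)
log_m = rng.uniform(8.5, 11.0, 2000)
mock_relation = 9.0 + 0.3 * (log_m - 10.0) - 0.08 * (log_m - 10.0) ** 2
oh = mock_relation + rng.normal(0.0, 0.1, log_m.size)

metal_rich, coeffs = split_by_metallicity(log_m, oh)
print("fraction classified as metal-rich:", metal_rich.mean())
```

Because the cut follows the fitted relation, roughly half of the galaxies fall on either side of it at every stellar mass, so the two subsamples can be compared at fixed mass.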

In order to test whether this explanation is plausible, 
one requires a model for the chemical enrichment of galax- 
ies that takes into account the effect of supernova-driven 
winds and the infall of gas from the surrounding halo. Such
models already exist. For example, De Lucia et al. (2004)
implemented a model for the chemical enrichment of galax-
ies in a high resolution simulation of a ΛCDM universe. The
transport of metals between the stars, the cold gas in galax- 



ies, the hot gas in dark matter haloes and the intergalactic 
gas outside virialized haloes was modelled in detail. In the 
scheme adopted by the authors, metals are ejected outside 
the halo by supernovae and later reincorporated when struc- 
ture collapses on larger scales. The model also followed the 
continued infall of cold gas onto the galaxy through merging 
of gas-rich satellites and through cooling from a surround- 
ing hot gas halo. After suitable adjustments to the free pa- 
rameters in the model, a good fit to the observed relations 
between stellar mass, gas mass and metallicity was obtained. 



We now test whether these same models can reproduce
the secondary trend with gas fraction that we see in
our data. In Figure 9 we plot the stellar mass-gas fraction
relation predicted by the semi-analytic model of
De Lucia & Blaizot (2007), which incorporates the same
chemical enrichment scheme as in De Lucia et al. (2004). We
have selected all galaxies in the z = 0 catalogue (available at
http://www.mpa-garching.mpg.de/millennium) with specific
star formation rates $\log_{10}({\rm SFR}/M_*) > -11$. Note that
in Sample IV 99% of galaxies have specific star formation rates
above this value. As in the previous figure, the galaxies are
divided into two types, "metal-poor" and "metal-rich", by
defining a stellar mass-dependent cut in exactly the same
way as was done for the sample of observed galaxies. As
can be seen, the models do predict a displacement of the
metal-poor with respect to the metal-rich subsamples that
is qualitatively reminiscent of the trend seen in the real data
(see also de Rossi et al. 2009 for a similar finding for Milky
Way type systems in the Millennium Simulation). However,
the quantitative agreement is not so good. In particular, 
in the simulations, the two populations are displaced more 
along the x-axis (i.e. in stellar mass) than along the y-axis 
Figure 9. Gas mass fraction versus stellar mass relation for
metal-rich (red) and metal-poor (blue) galaxies in the z = 0
semi-analytic catalogue of De Lucia & Blaizot (2007). The meaning of
the colour coding and contour levels is the same as in the
right-hand panel of the previous figure.



(i.e. in gas fraction), whereas the opposite is seen in the real 
data. 
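
The selection applied to the model catalogue amounts to a simple threshold on specific star formation rate. The sketch below illustrates it on a few made-up entries; it assumes that SFR is in $M_\odot\,{\rm yr}^{-1}$ and stellar mass in $M_\odot$ (the units are not stated in the text), and the function name is hypothetical.

```python
import math


def passes_ssfr_cut(sfr, stellar_mass, log_threshold=-11.0):
    """True if log10(SFR / M_*) exceeds the threshold.

    Galaxies with zero star formation rate are rejected outright,
    since their logarithmic specific rate is undefined.
    """
    if sfr <= 0.0 or stellar_mass <= 0.0:
        return False
    return math.log10(sfr / stellar_mass) > log_threshold


# Hypothetical (SFR, stellar mass) pairs.
mock = [(1.0, 1.0e10), (0.01, 1.0e10), (0.0, 5.0e9)]
selected = [pair for pair in mock if passes_ssfr_cut(*pair)]
print(selected)
```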

This may indicate that the efficiency with which supernovae
are assumed to expel gas from galaxies is too high,
and that the mass-metallicity trends in the real data are
driven more by variations in cold gas content than they are in
the De Lucia & Blaizot (2007) models. There is a free parameter
in the models that controls the efficiency with which
energy from supernovae couples to the gas, and as this
increases, the mass-metallicity relation steepens. Variations
in gas fraction are determined by cooling and infall, which
should be modelled more accurately.

One issue that we have swept under the carpet until
now is that we are assuming that the total gas content of the
galaxy simply scales in proportion to the H I mass, thereby
neglecting any intrinsic variations in the ratio of
atomic-to-molecular gas in galaxies. If this ratio were to depend
systematically on metallicity (see for example Krumholz et al.
2008), then part of the vertical displacement in H I fraction
between metal-poor and metal-rich galaxies may simply be
due to this effect.



5 SUMMARY 

We have examined the correlation of atomic hydrogen-to-stellar
mass ratio, $G_{\rm HI}/S$, for a sample of 800 galaxies that
have H I flux measurements from the HyperLeda catalogue
and optical photometry from the SDSS DR4. We use this
sample to derive a new estimator of H I mass fraction for
low-redshift galaxies (z < 0.1). This is
$\log_{10}(G_{\rm HI}/S) = -1.73238\,(g - r) + 0.215182\,\mu_i - 4.08451$,
where $\mu_i$ is the i-band surface brightness and (g - r) is the
optical colour given by the g- and r-band Petrosian magnitudes.
The typical scatter in the relation is 0.31 dex. We have tested
whether the residuals in our $G_{\rm HI}/S$ estimator correlate
with galaxy properties. We find no effect as a function of
stellar mass or as a function of mean stellar age as measured
by the 4000 Å break, but there is a small effect as a function
of galaxy concentration.
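
For convenience, the estimator can be written as a short function. The sketch below is only an illustration: the coefficients are those given above, while the function names, the assumption that $\mu_i$ is in mag arcsec$^{-2}$, and the example input values are ours.

```python
def log_hi_to_stellar(g_minus_r, mu_i):
    """Photometric estimate of log10(G_HI / S).

    g_minus_r : g - r colour from Petrosian magnitudes
    mu_i      : i-band surface brightness (assumed in mag arcsec^-2)
    """
    return -1.73238 * g_minus_r + 0.215182 * mu_i - 4.08451


def hi_mass(stellar_mass, g_minus_r, mu_i):
    """Estimated H I mass, in the same units as stellar_mass."""
    return stellar_mass * 10.0 ** log_hi_to_stellar(g_minus_r, mu_i)


# Hypothetical galaxy: g - r = 0.5, mu_i = 21, M_* = 1e10 Msun.
log_ratio = log_hi_to_stellar(0.5, 21.0)
print(log_ratio, hi_mass(1.0e10, 0.5, 21.0))
```

Any individual estimate carries the 0.31 dex scatter of the calibration, so the function is best suited to statistical use on large samples. Bluer colours and fainter surface brightnesses (larger $\mu_i$) both raise the estimated ratio.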

We then apply this new estimator to a large sample of 
star-forming galaxies from the SDSS DR4 to estimate the H 
I mass function, and we find good agreement with determi- 
nations from recent H I surveys. This demonstrates that our 
estimator does, at least in a statistical sense, properly repro- 
duce the distribution of H I mass in the local Universe. We 
have also used the data to examine whether the stellar
mass-metallicity relation of galaxies depends on gas content, and
we find a systematic change in gas fraction along this relation.
In addition, at fixed stellar mass, galaxies with higher
metallicities tend to contain less gas.

Finally, we would like to discuss how this work could
be improved in future. First, the calibration sample (Sample
II) is taken from HyperLeda, which is an inhomogeneous
collection of data from different observational programs, each
with different selection effects. It is not clear to what
extent our calibrating sample is fully representative of the
local galaxy population. The fact that we do see some
systematic bias as a function of concentration index suggests
that as the bulge component becomes more prominent, our
estimator, which is motivated by the Kennicutt-Schmidt
law for galactic disks, may become increasingly inaccurate.
We intend to test this using the larger and deeper samples
that will soon become available from surveys such as
ALFALFA and the GALEX Arecibo SDSS Survey (GASS;
Catinella et al. 2008). Another problem is that our calibrating
sample lies at very low redshifts, so that any quantity
that is measured from the SDSS spectra, such as metallic- 
ity, is heavily weighted to the central regions of the galax- 
ies. We have skirted around this problem by dividing the
galaxy population into two metallicity bins containing equal
numbers of galaxies and carrying out a relative comparison
between the gas fractions of the high- and low-metallicity
sub-samples at fixed stellar mass. This means that aperture
biases are at least roughly the same for our two sub-samples.
Nevertheless, it should be borne in mind that when we re- 
fer to metallicity in this paper, we are not talking about a 
global-average quantity. 

We believe that it will be most interesting to apply
our new indicator to galaxy samples where there is little
prospect of getting real H I mass measurements in the
foreseeable future, for example at high redshift. The indicator may
also be useful for applications where one would like to have 
some rough estimate of the cold gas available to fuel star for- 
mation, for example in satellite galaxies that are destined to 
merge with their parent object on some short timescale. We 
will be looking into such applications in our future work. 